DNA construct for sequencing and method for preparing the same

ABSTRACT

A DNA construct comprising multiple units sequentially attached one to the other, wherein a unit comprises: a segment; an index attached to one end of the segment; an identifier attached to another end of the segment; an introducer attached to a 5′-end of either the index or the identifier, and a closure attached to a 5′-end of a remaining either identifier or index. A method for preparing the DNA construct and a method for analyzing a sequence of the DNA construct, as well as various embodiments thereof are disclosed herein.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. Provisional Patent Application No. 62/505,920, filed 14 May 2017, the entire contents of which are incorporated herein by reference.

FIELD

The present subject matter relates to DNA sequencing. More particularly, the present subject matter relates to the preparation of DNA for sequencing.

BACKGROUND

Analysis of DNA sequences of patients enables better diagnostics and, with that, the ability to provide specific and better treatments for genetically-based ailments. A DNA sequence of a whole genome or target regions of an individual may be compared to known sequences of the human genome in order to find variations that account for potential diseases, for example mutations that may cause cancer. Knowing and understanding the genetic information of each patient with respect to specific ailments helps in preventing adverse events, allows for providing appropriate drug treatments and promotes maximal efficacy of drug prescriptions.

The field of DNA sequencing has advanced rapidly in recent years, enabling relatively rapid sequencing of very long DNA fragments, in the range of thousands of base pairs and even more than substantially 100,000 bp. For example, nanopore sequencing is an advanced DNA sequencing method that provides a short, easy and fast procedure for sequencing libraries of very long DNA segments. This technology has the potential to offer relatively low-cost genotyping, high mobility for testing, and rapid processing of samples with the ability to display results in real time. An exemplary nanopore sequencing platform is MinION (Oxford Nanopore Technologies Limited, UK).

Nanopore sequencing is configured to sequence very long DNA fragments, in the range of substantially 1,000-10,000 base pairs (bp) and even more than substantially 100,000 bp. However, one drawback of nanopore sequencing is its accuracy, which is substantially 90%. This is critical in diagnosing mutation-based diseases, since there is no way to distinguish between mutations in the target sequence and errors in the sequencing that may be interpreted as mutations in the target sequence. In addition, one of the ways to prepare DNA for nanopore sequencing is amplifying a region of interest (ROI) by polymerase chain reaction (PCR). It is well known in the art that during PCR, errors are introduced into the sequence of the PCR product due to poor proofreading by the DNA polymerase used in the PCR. These errors may also be interpreted as mutations in the target sequence. Furthermore, target DNA fragments that are normally sequenced, for example for genotyping and diagnostics, are much shorter, in the range of a few hundred base pairs, compared to the thousands of base pairs sequenced by nanopore sequencing. This renders nanopore sequencing unsuitable for sequencing short DNA fragments.

SUMMARY

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this subject matter belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present subject matter, suitable methods and materials are described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

According to one aspect of the present subject matter, there is provided a DNA construct comprising multiple units sequentially attached one to the other, wherein a unit comprises:

-   a segment;
-   an index attached to one end of the segment;
-   an identifier attached to another end of the segment;
-   an introducer attached to a 5′-end of either the index or the identifier, and
-   a closure attached to a 5′-end of a remaining either identifier or index.

According to one embodiment, the length of the DNA construct is at least substantially 1,000 bp.

According to another embodiment, the length of the segment is up to substantially 1,000 bp.

According to yet another embodiment, the length of the segment is in the range of substantially 100-500 bp.

According to another aspect of the present subject matter, there is provided a method for preparing a DNA construct, the method comprising:

-   obtaining a segment;
-   attaching an index, an identifier, an introducer and a closure to the segment, wherein
    -   the index 14 is attached to one end of the segment;
    -   the identifier is attached to another end of the segment;
    -   the introducer 18 is attached to a 5′-end of either the index or the identifier; and
    -   the closure is attached to a 5′-end of a remaining either identifier 16 or index 14, giving rise to a pre-mature unit 1;
-   amplifying the pre-mature unit with primers specific to the introducer and closure, giving rise to a double stranded mature unit 1;
-   phosphorylating 5′-ends of the strands of the mature unit 1, and
-   sequentially attaching mature units.

According to yet another aspect of the present subject matter, there is provided a method for analyzing a sequence of a DNA construct according to claim 1, the method comprising:

-   separating sequences of units one from the other;
-   grouping units having the same index for obtaining same index groups;
-   grouping units having the same segment sequence in each same index group for obtaining same segment groups;
-   grouping units having the same identifier sequence in each same segment group for obtaining same identifier 16 groups;
-   collapsing multiple segment sequences in each same identifier group to a single sequence that accurately represents the sequence of the target sequence according to which the segment was obtained.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments are herein described, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of the preferred embodiments, and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of the embodiments. In this regard, no attempt is made to show structural details in more detail than is necessary for a fundamental understanding, the description taken with the drawings making apparent to those skilled in the art how several forms may be embodied in practice.

In the drawings:

FIGS. 1A-B schematically illustrate, according to some exemplary embodiments, a unit of a DNA construct.

FIGS. 2A-B schematically illustrate, according to an exemplary embodiment, a forward primer and a reverse primer, respectively, for a first PCR.

FIG. 3 schematically illustrates, according to an exemplary embodiment, a DNA construct that allows sequencing of short DNA segments.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Before explaining at least one embodiment in detail, it is to be understood that the subject matter is not limited in its application to the details of construction and the arrangement of the components set forth in the following description or illustrated in the drawings. The subject matter is capable of other embodiments or of being practiced or carried out in various ways. Also, it is to be understood that the phraseology and terminology employed herein is for the purpose of description and should not be regarded as limiting. In discussion of the various figures described herein below, like numbers refer to like parts. The drawings are generally not to scale.

For clarity, non-essential elements were omitted from some of the drawings.

The present subject matter provides a DNA construct that allows sequencing of short DNA segments, for example with a length of hundreds of base pairs, by platforms configured to sequence long DNA fragments, in the range of substantially 1,000-10,000 bp, and even up to 100,000 bp and more, for example the nanopore sequencing platform.

The present subject matter further provides a DNA construct that allows sequencing of short DNA segments, for example with a length of hundreds of base pairs, multiple times, giving rise to accurate sequencing results, by platforms configured to sequence long DNA fragments, in the range of substantially 1,000-10,000 bp, and even up to 100,000 bp and more, for example the nanopore sequencing platform.

The present subject matter further provides a DNA construct that allows simultaneous sequencing of multiple different short DNA segments, for example with a length of hundreds of base pairs, from different origins, by platforms configured to sequence long DNA fragments, in the range of substantially 1,000-10,000 bp, and even up to 100,000 bp and more, for example the nanopore sequencing platform, while allowing identification of each segment and its origin according to the sequences obtained.

The present subject matter additionally provides a DNA construct that allows distinguishing between mutations in a target sequence and errors introduced into the sequence obtained, for example by poor accuracy of the sequencing method and errors introduced during amplification of the ROI.

The present subject matter additionally provides a method for preparing a DNA construct that allows sequencing of short DNA segments, for example with a length of hundreds of base pairs, by platforms configured to sequence long DNA fragments, in the range of substantially 1,000-10,000 bp, and even up to 100,000 bp and more, for example the nanopore sequencing platform.

The present subject matter further provides a method for preparing a DNA construct that allows sequencing of short DNA segments, for example with a length of hundreds of base pairs, multiple times, giving rise to accurate sequencing results, by platforms configured to sequence long DNA fragments, in the range of substantially 1,000-10,000 bp, and even up to 100,000 bp and more, for example the nanopore sequencing platform.

The present subject matter additionally provides a method for preparing a DNA construct that allows simultaneous sequencing of multiple different short DNA segments, for example with a length of hundreds of base pairs, from different origins, by platforms configured to sequence long DNA fragments, in the range of substantially 1,000-10,000 bp, and even up to 100,000 bp and more, for example the nanopore sequencing platform, while allowing identification of each segment and its origin according to the sequences obtained.

The present subject matter further provides a method for preparing a DNA construct that allows distinguishing between mutations in a target sequence and errors introduced into the sequence obtained, for example by poor accuracy of the sequencing method and errors introduced during amplification of the ROI.

The present subject matter further provides a method for analyzing DNA sequences obtained by sequencing of long DNA fragments, for example nanopore sequencing, while distinguishing between mutations in a target sequence and errors introduced into the sequence by the method itself, for example errors in sequencing and errors introduced during amplification of the ROI.

The DNA construct of the present subject matter dramatically improves the accuracy of DNA sequencing reads when sequencing short DNA segments, for example with a length of hundreds of base pairs, by platforms configured to sequence long DNA fragments, in the range of substantially 1,000-10,000 bp, and even up to 100,000 bp and more, for example the nanopore sequencing platform. This DNA construct, then, may be used in diagnosing genetic variations with high sensitivity and specificity.

The DNA construct comprises a plurality of units.

FIGS. 1A-B schematically illustrate, according to some exemplary embodiments, a unit of a DNA construct. The unit 1 comprises a segment 12, namely a target DNA sequence, the analysis of which is desired. Any target DNA sequence known in the art is under the scope of the present subject matter, for example a gene or a part of a gene in which a mutation is sought for the diagnostics of a gene-based disease, like cancer, a genetic disorder and the like. The segment 12 may be of any desired length. According to one embodiment, the segment is a few hundred base pairs long, up to substantially 1,000 bp long. According to a preferred embodiment, the length of the segment 12 is in the range of substantially 100-500 bp.

The unit 1 further comprises an index 14 attached to one end of the segment 12 and an identifier 16 attached to an opposite end of the segment 12. As can be seen in FIG. 1A, according to one embodiment, the index 14 is attached to the 5′-end of the segment 12 and the identifier 16 is attached to the 3′-end of the segment 12. As can be seen in FIG. 1B, according to another embodiment, the index 14 is attached to the 3′-end of the segment 12 and the identifier 16 is attached to the 5′-end of the segment.

According to one embodiment, the index 14 is a DNA sequence that is unique to the origin of the segment 12. The same index 14 sequence is attached to every copy of the segment 12 that originates from a certain origin, and a different index 14 sequence is used for each origin. An origin may be, for example, an individual from which the segment 12 is obtained. Thus, the index 14 is configured to tag the origin of the segment 12. According to another embodiment, the index 14 is a DNA sequence that is unique to the target sequence of the segment 12. The same index 14 sequence is attached to every copy of the segment 12 that originates from a certain target sequence, and a different index 14 sequence is used for each target sequence. A target sequence may be, for example, a certain gene that is diagnosed for mutations, a certain tag sequence and the like. A person skilled in the art should understand, then, that the index 14 may be simultaneously unique to the origin and the target sequence of the segment 12. The length of the index 14 may be any length that allows unique tagging of each origin. For example, the length of the index 14 is substantially 12 bp. It should be noted though that this is only an exemplary length and that any length of the index 14 is under the scope of the present subject matter.

According to one embodiment, the index 14 may be split into two parts and each part of the index 14 may be attached to any one of the two ends of the segment 12. For example, a 12 bp long index 14 is split into a first 6 bp index 14 part and a second 6 bp index 14 part. The first index 14 part is attached to one end of the segment 12 and the second index 14 part is attached to another end of the segment 12.

According to one embodiment, the identifier 16 is a DNA sequence that is unique for every copy of the segment 12. A different identifier 16 sequence is attached to each copy of the segment 12. Thus, the identifier 16 is configured to tag each copy of the segment 12. In a method described hereinafter, the unit 1 is amplified, for example by PCR, and at a later stage the sequences of multiple copies of the unit 1 are analyzed. The sequences of the segments 12 of the units 1 that comprise the same identifier 16 are considered to be amplified from the same original segment 12, or original target sequence. Thus, as will be discussed hereinafter, one can distinguish between mutations in the original segment 12, or the original target sequence, and errors introduced during the procedure. The length of the identifier 16 may be any length that allows unique tagging of each copy of the segment 12. For example, the length of the identifier 16 is substantially 12 bp. It should be noted though that this is only an exemplary length and that any length of the identifier 16 is under the scope of the present subject matter.
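As a rough illustration of why an identifier of substantially 12 bp can tag each copy uniquely, the following sketch counts the possible 12-base identifiers and estimates, using the standard birthday approximation, the chance that at least two copies receive the same identifier. The assumption that identifiers are drawn uniformly at random, the function names and the copy numbers used are illustrative only and are not taken from the present disclosure.

```python
import math


def identifier_space(length_bp: int) -> int:
    """Number of distinct identifiers of the given length (4 bases per position)."""
    return 4 ** length_bp


def collision_probability(copies: int, length_bp: int) -> float:
    """Approximate probability that at least two copies share an identifier.

    Assumption of this sketch: identifiers are drawn uniformly at random.
    Uses the birthday approximation 1 - exp(-n(n-1) / (2N)).
    """
    space = identifier_space(length_bp)
    return 1.0 - math.exp(-copies * (copies - 1) / (2.0 * space))


if __name__ == "__main__":
    print(identifier_space(12))  # 16777216
    for copies in (100, 1000, 10000):  # illustrative copy numbers
        print(copies, collision_probability(copies, 12))
```

Under these assumptions, the chance of a shared identifier stays small for a few thousand copies and grows as the number of tagged copies approaches the square root of the identifier space, which is one reason a longer identifier may be chosen for deeper sampling.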

According to one embodiment, the identifier 16 may be split into two parts and each part of the identifier 16 may be attached to any one of the two ends of the segment 12. For example, a 12 bp long identifier 16 is split into a first 6 bp identifier 16 part and a second 6 bp identifier 16 part. The first identifier 16 part is attached to one end of the segment 12 and the second identifier 16 part is attached to another end of the segment 12.

The unit 1 further comprises an introducer 18 at the 5′-end of the unit 1 and a closure 19 at the 3′-end of the unit 1. According to the embodiment illustrated in FIG. 1A, the introducer 18 is attached to the 5′-end of the index 14 and the closure 19 is attached to the 3′-end of the identifier 16. According to the embodiment illustrated in FIG. 1B, the introducer 18 is attached to the 5′-end of the identifier 16 and the closure 19 is attached to the 3′-end of the index 14.

The introducer 18 and the closure 19 are configured to serve as target sequences for the annealing of primers of a PCR. For example, the introducer 18 is configured to anneal with a forward primer and the closure 19 is configured to anneal with a reverse primer, and vice versa, during PCR cycling. In addition, in a method described hereinafter, sequences of units 1 sequentially attached one to the other are analyzed. Since the introducer 18 and closure 19 are placed at the ends of the unit 1, they are also configured to indicate the borders of the unit 1 sequences.
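The role of the introducer 18 and closure 19 as border markers can be sketched as follows for the arrangement of FIG. 1A. This is a minimal sketch under stated assumptions: the name parse_unit and the Unit record are hypothetical, exact string matching is used for simplicity (reads with sequencing errors would need error-tolerant matching), and the closure argument is assumed to be given as it appears on the strand being read.

```python
from typing import NamedTuple, Optional


class Unit(NamedTuple):
    index: str
    segment: str
    identifier: str


def parse_unit(read: str, introducer: str, closure: str,
               index_len: int = 12, identifier_len: int = 12) -> Optional[Unit]:
    """Split one unit (FIG. 1A layout) into index, segment and identifier.

    Layout assumed on the strand being read, 5' to 3':
    introducer | index | segment | identifier | closure.
    Returns None when the borders are not found or the body is too short.
    """
    start = read.find(introducer)
    if start < 0:
        return None
    body_start = start + len(introducer)
    end = read.find(closure, body_start)
    if end < 0:
        return None
    body = read[body_start:end]
    if len(body) <= index_len + identifier_len:
        return None
    return Unit(index=body[:index_len],
                segment=body[index_len:len(body) - identifier_len],
                identifier=body[len(body) - identifier_len:])


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    toy_read = ("TT" + "AAAACCCC" + "CGTGATCGTGAT" + "GATTACAGATTACA"
                + "ACGTACGTACGT" + "GGGGTTTT" + "AA")
    print(parse_unit(toy_read, introducer="AAAACCCC", closure="GGGGTTTT"))
```

The 12-base defaults follow the exemplary index 14 and identifier 16 lengths given above; other lengths can be passed explicitly.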

Segments 12 may be obtained by any method and mechanism known in the art. According to one embodiment, segments 12 may be obtained by shearing of nucleic acids, for example shearing of genomic DNA, total RNA, mRNA and the like. Any type of nucleic acid shearing known in the art is under the scope of the present subject matter. According to this embodiment, for preparing the unit 1, the index 14, identifier 16, introducer 18 and closure 19 are attached to the segments 12 by any method known in the art, for example by ligating them to the segments 12 to obtain the embodiments of the unit 1 illustrated in FIGS. 1A-B. This ligation gives rise to a pre-mature unit 1.

According to another embodiment, segments 12 may be obtained by a first PCR, using forward and reverse primers that define the desired sequence of the segment 12. Thus, the forward primer for the first PCR is specific to a sequence at the 5′-end of the segment 12 and a reverse primer for the first PCR is specific to a sequence at the 3′-end of the segment 12. The template for the first PCR may be any template known in the art that may be a source of segments 12, for example genomic DNA, a cDNA library and the like.

According to one embodiment, after the segments 12 are amplified by the first PCR, units are prepared by attaching the index 14, identifier 16, introducer 18 and closure 19 to the amplified segments 12 by any method known in the art, for example by ligating them to the amplified segments 12 to obtain the embodiments of the unit 1 illustrated in FIGS. 1A-B. This ligation gives rise to a pre-mature unit 1.

According to another embodiment, the pre-mature unit 1 may be prepared during the first PCR. According to this embodiment, the primers that are used for amplifying the segments 12 comprise tails with sequences of the index 14, identifier 16, introducer 18 and closure 19.

FIGS. 2A-B schematically illustrate, according to an exemplary embodiment, a forward primer and a reverse primer, respectively, for a first PCR. The forward primer 20 for the first PCR, illustrated in FIG. 2A, comprises a specific Fwd 122 sequence specific to the 5′-end of the segment 12, an index 14 sequence tail attached to the 5′-end of the specific Fwd 122 sequence and an introducer 18 (which may also be termed introducing 18) sequence tail attached to the 5′-end of the index 14 sequence. The reverse primer 30 for the first PCR, illustrated in FIG. 2B, comprises a specific Rev 124 sequence specific to the 3′-end of the segment 12, an identifier 16 sequence tail attached to the 3′-end of the specific Rev 124 sequence and a closure 19 sequence tail attached to the 3′-end of the identifier 16 sequence. A person skilled in the art may recognize that the primers illustrated in FIGS. 2A-B give rise, following the first PCR, to the embodiment of the unit 1 illustrated in FIG. 1A. In order to obtain the embodiment of the unit 1 illustrated in FIG. 1B, the primers for the first PCR should be arranged accordingly.

It is designated in FIGS. 2A-B that the length of the specific Fwd 122 sequence and the specific Rev 124 sequence is in the range of substantially 20-25 bp. It should be noted that this range of length of the specific Fwd 122 sequence and the specific Rev 124 sequence is only exemplary, and that any length of the specific Fwd 122 sequence and the specific Rev 124 sequence is under the scope of the present subject matter. Similarly, it should be noted that the sequences of the index 14, introducer 18 and closure 19, shown in FIGS. 2A-B, are only exemplary, and that the index 14, introducer 18 and closure 19, as well as the identifier 16, may have any possible sequence in any possible length.

According to one embodiment, the first PCR comprises a low number of amplification cycles, in order to avoid introduction of false mutations in the segment 12 due to poor proofreading by the DNA polymerase used in the PCR. Any number of cycles of the first PCR that produces a sufficient amount of amplicons, namely pre-mature units 1, to be used as templates for a second PCR on the one hand, while minimizing the amount of false mutations introduced by the DNA polymerase on the other hand, is under the scope of the present subject matter. An exemplary number of cycles of the first PCR is substantially 3-5 cycles.

The pre-mature unit 1, made by any way known in the art, for example according to the aforementioned embodiments (nucleic acid shearing and ligation; first PCR and ligation; or first PCR with tailed primers), serves as a template for a second PCR.

According to one embodiment, the forward primer used in the second PCR is specific to the introducer 18 and the reverse primer used in the second PCR is specific to the closure 19. It should be noted that the forward and reverse primers of the second PCR do not comprise the index 14 and identifier 16 sequences. As a result, the sequences of the index 14 and identifier 16 are amplified by the DNA polymerase used in the second PCR. Thus, the second PCR is configured to amplify the pre-mature unit 1. The product of the second PCR is a mature unit 1.

According to one embodiment, the mature unit 1, like the pre-mature unit 1, is a double-stranded DNA. The 5′-end of each strand of the mature unit 1 is phosphorylated. Any method for phosphorylating the 5′-ends of the strands of the mature unit 1 is under the scope of the present subject matter. For example, the 5′-ends of the product of the second PCR may be phosphorylated with an enzyme configured to add a phosphate group to a 5′-end of a DNA strand. Another example is to use primers of the second PCR that are phosphorylated at their 5′-ends.

The DNA construct that allows sequencing of short DNA segments, for example in the range of substantially 200-300 bp, by platforms configured to sequence long DNA fragments, in the range of substantially 1,000-10,000 bp, for example the nanopore sequencing platform, is prepared by ligating mature units 1 one to the other, giving rise to a long DNA fragment, in the range of substantially 1,000-10,000 bp.
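By way of worked arithmetic only, the number of mature units 1 that fit within a construct of substantially 1,000-10,000 bp can be estimated from the unit length. The sketch below assumes the exemplary element lengths given in the Examples (a 24-base introducer, a 12-base index, a 12-base identifier and a 21-base closure) and a hypothetical 300 bp segment; the function names are illustrative.

```python
import math


def unit_length(segment_bp: int, introducer_bp: int = 24, index_bp: int = 12,
                identifier_bp: int = 12, closure_bp: int = 21) -> int:
    """Length of one unit: introducer + index + segment + identifier + closure."""
    return introducer_bp + index_bp + segment_bp + identifier_bp + closure_bp


def units_for_range(segment_bp: int, min_bp: int = 1000, max_bp: int = 10000):
    """Smallest and largest number of ligated units whose total length
    falls within [min_bp, max_bp]."""
    length = unit_length(segment_bp)
    return math.ceil(min_bp / length), max_bp // length


if __name__ == "__main__":
    print(unit_length(300))      # 24 + 12 + 300 + 12 + 21 = 369
    print(units_for_range(300))  # (3, 27)
```

With these assumed lengths, a 300 bp segment gives a 369 bp unit, so between 3 and 27 ligated units yield a construct within the stated range.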

FIG. 3 schematically illustrates, according to an exemplary embodiment, a DNA construct that allows sequencing of short DNA segments. This DNA construct is designated hereinafter “DNA construct 100”. The DNA construct 100 comprises multiple units 1 sequentially attached one to the other. The units 1 may be in any direction one relative to the other. The length of the DNA construct 100 is a length suitable for sequencing by sequencing methods configured to sequence long DNA sequences, for example nanopore sequencing. Thus, for example, the length of the DNA construct 100 may be in the range of substantially 1,000-10,000 bp, and even up to substantially 100,000 bp and more. The units 1 may be sequentially attached one to the other by any method known in the art, for example ligation.

The DNA construct 100 is then sequenced by any method known in the art for sequencing long DNA fragments, for example in the range of substantially 1,000-10,000 bp, and even up to substantially 100,000 bp and more, like nanopore sequencing, and more particularly Oxford Nanopore Technology. The result of this is a nucleotide sequence of the entire DNA construct 100. Any step of the sequencing method until the obtaining of the nucleotide sequence of the DNA construct 100 is under the scope of the present subject matter. This may include, for example, base calling of the sequences, namely conversion of raw data from the sequencing instrument to nucleotide sequences. This may also include data cleanup, namely trimming of corrupted sequences and sequences that are not related to the sequence of the DNA construct 100, for example very low quality sequences, or sequence elements that belong to the sequencing procedure, for example adaptors that are part of the nanopore sequencing method.

The present subject matter provides a method for analyzing a sequence of the DNA construct 100, which as described above may be obtained by any method known in the art. The method for analyzing a sequence of the DNA construct 100 comprises:

separating sequences of units 1 one from the other. This is done by identifying sequences of the ends of the units 1 and separating them in between;

grouping units 1 having the same index 14 for obtaining same index 14 groups. In other words, at this stage the units 1 are classified according to the origins of the target sequences. For example, sequences of one individual are grouped together because they have the same index 14, while sequences of another individual are grouped separately because they have another index 14;

grouping units 1 having the same segment 12 sequence in each same index 14 group, for obtaining same segment 12 groups. At this stage, the units 1 of each origin, for example an individual, are grouped in separate groups of target DNA, for example separate genes. This is achieved by grouping units 1 having the same segment sequence in one group;

grouping units 1 having the same identifier 16 sequence in each same segment 12 group for obtaining same identifier 16 groups. At this stage, units 1 obtained from the same copy of the segment are grouped in one group. As described above, the pre-mature units comprise various copies of a certain target sequence, namely a certain segment 12. Each copy is tagged with a different identifier 16, and then the tagged pre-mature units 1 are amplified in the second PCR for obtaining mature units 1. Thus, each copy of the target sequence, namely the segment 12, is amplified during the second PCR, and errors may be introduced into the segment 12 during amplification. In addition, during the sequencing, errors in reading of the sequence of the segment 12 may be obtained. Therefore, this stage of grouping units 1 having the same identifier 16 sequence is important, since it allows identifying errors in the segment 12 sequence due to the procedure and eliminating them, while identifying mutations in the target sequence that are sought for diagnostic purposes. It is easy to distinguish between errors in the segment 12 sequence due to the procedure and mutations in the target sequence, because mutations in the target sequence are detected in all segments 12 tagged with the same identifier 16, while errors due to the procedure may be detected in only one or a few segments tagged with the same identifier 16. Thus, after grouping units 1 having the same identifier 16 sequence in each same segment 12 group, the next step is;

collapsing multiple segment 12 sequences in each same identifier 16 group to a single sequence that accurately represents the sequence of the target sequence according to which the segment was obtained. During the collapsing, errors in the sequence due to the procedure are eliminated as described above.
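The grouping and collapsing steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: each unit is represented by a hypothetical record of index, target label, identifier and segment sequence; the target label is assumed to have been assigned beforehand (for example by matching the segment to known targets); and collapsing is done by a per-position majority vote that only handles equal-length segment reads, so insertions and deletions are not treated. Grouping by the composite key (index, target, identifier) is equivalent to the three nested grouping steps.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# (index, target label, identifier, segment sequence); hypothetical record layout.
UnitRecord = Tuple[str, str, str, str]
GroupKey = Tuple[str, str, str]


def majority_consensus(sequences: List[str]) -> str:
    """Per-position majority vote; assumes equal-length sequences (no indels)."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("this sketch only handles equal-length sequences")
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*sequences))


def collapse_units(units: Iterable[UnitRecord]) -> Dict[GroupKey, str]:
    """Group units by index, then target, then identifier, and collapse
    each same identifier group to a single consensus sequence."""
    groups: Dict[GroupKey, List[str]] = defaultdict(list)
    for index, target, identifier, segment in units:
        groups[(index, target, identifier)].append(segment)
    return {key: majority_consensus(seqs) for key, seqs in groups.items()}


if __name__ == "__main__":
    # Toy data for illustration only.
    toy_units = [
        ("CGTGATCGTGAT", "BRAF", "AAAACCCCGGGG", "ACGTACGT"),
        ("CGTGATCGTGAT", "BRAF", "AAAACCCCGGGG", "ACGTACGT"),
        ("CGTGATCGTGAT", "BRAF", "AAAACCCCGGGG", "ACGAACGT"),  # one read with a procedural error
        ("CGTGATCGTGAT", "BRAF", "TTTTGGGGCCCC", "ACCTACGT"),  # a different original copy
    ]
    for key, consensus in collapse_units(toy_units).items():
        print(key, consensus)
```

In the toy data, the single deviating read in the first identifier group is outvoted and removed, whereas the difference carried by the second identifier group is retained because it is present in every read of that group.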

According to one embodiment, the method for analyzing a sequence of the DNA construct 100 further comprises, after collapsing multiple segment 12 sequences in each same identifier 16 group to a single sequence, comparing the sequences obtained by the collapsing with known sequences of the target sequences, in order to identify variants in the collapsed sequences of the target sequences (segments 12) compared to the known sequences of the target sequences.

According to another embodiment, the method for analyzing a sequence of the DNA construct 100 further comprises, after comparing the sequences obtained by the collapsing with known sequences of the target sequences, reporting mutations found in the variants.
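The comparing and reporting steps can be illustrated with a simple position-by-position comparison. The sketch below is an assumption-laden example: the function name find_substitutions is hypothetical, the collapsed sequence and the known sequence are assumed to be already aligned and of equal length, and only substitutions are reported.

```python
from typing import List, Tuple


def find_substitutions(consensus: str, reference: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, consensus base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length;
    insertions and deletions are outside the scope of this sketch.
    """
    if len(consensus) != len(reference):
        raise ValueError("sequences must be aligned to equal length")
    return [(position, ref_base, cons_base)
            for position, (ref_base, cons_base)
            in enumerate(zip(reference, consensus), start=1)
            if ref_base != cons_base]


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(find_substitutions(consensus="ACCTACGT", reference="ACGTACGT"))
```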

The present subject matter provides a method for preparing the DNA construct 100 described above, the method comprising:

obtaining a segment 12;

attaching an index 14, an identifier 16, an introducer 18 and a closure 19 to the segment 12, wherein

the index 14 is attached to one end of the segment 12;

the identifier 16 is attached to another end of the segment 12;

the introducer 18 is attached to a 5′-end of either the index 14 or the identifier 16; and

the closure 19 is attached to a 5′-end of a remaining either identifier 16 or index 14, giving rise to a pre-mature unit 1;

amplifying the pre-mature unit with primers specific to the introducer and closure, giving rise to a double stranded mature unit 1;

phosphorylating 5′-ends of the strands of the mature unit 1, and

sequentially attaching mature units.

Embodiments of the unit 1 are illustrated in FIGS. 1A-B and an embodiment of the DNA construct 100 is illustrated in FIG. 3.

According to one embodiment, the obtaining of the segment 12 is by shearing of a poly-nucleic acid.

According to another embodiment, the poly-nucleic acid is a genomic DNA.

According to yet another embodiment, the poly-nucleic acid is total RNA.

According to still another embodiment, the poly-nucleic acid is mRNA.

According to one embodiment, the obtaining of the segment 12 is by amplifying the segment with primers specific to the segment 12.

According to one embodiment, the attaching of the index 14, identifier 16, introducer 18 and closure 19 to the segment 12 is by attaching the index 14, identifier 16, introducer 18 and closure 19 to the primers specific to the segment 12 and amplifying the segment, wherein the index 14 is attached to a 5′-end of either a forward or reverse segment 12 specific primer, the identifier 16 is attached to a 5′-end of a remaining either reverse or forward segment 12 specific primer, the introducer 18 is attached to a 5′-end of either the index 14 or identifier 16 and the closure 19 is attached to a 5′-end of a remaining either identifier 16 or index 14.

Embodiments of the primers to which the index 14, identifier 16, introducer 18 and closure 19 are attached are illustrated in FIGS. 2A-B.

EXAMPLES

Primers for First PCR

For each mutation to be tested, specific sequences are located for primers that allow amplification of a certain target sequence, also termed “segment”, while the primers for the first PCR flank a desired tested location. Amplicon size is, for example, substantially 200-400 bp, the melting temperature (Tm) of the primers is substantially 63-65° C. and the primer length is substantially 18-26 bp. An example of specific primers (forward and reverse) for the BRAF mutation at amino acid position V600:

Fwd - AGCCTCAATTCTTACCATCCAC, Rev - CTTCATAATGCTTGCTCTGATAGG.

For each mutation-specific sequence of the first stage primers, the following unique elements are added.

Index: 12 bases 5′ to the forward specific sequence (example: CGTGATCGTGAT).

Introducer: addition of 24 bases upstream of the Index (forward primer example: CAAGCAGAAGACGGCATACGAGAT). The length and sequence of the Introducer can vary according to the External-Fwd primer sequence.

Identifier: 12 random bases (12×N) at the 5′ end of the reverse specific primer.

Closure: addition of 21 bases upstream of the identifier (reverse primer example: AATGATACGGCGACCACCGAG). The length and sequence of the Closure can vary according to the Rev primer sequence.
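The layout of the first stage primers described above can be sketched by concatenating the example elements, all written 5′ to 3′. The variable and function names are illustrative, and the identifier is written as twelve N characters to stand for the 12 random bases; the sketch only assembles the strings and does not assess primer performance.

```python
# Example sequences taken from the text above, written 5' to 3'.
FWD_SPECIFIC = "AGCCTCAATTCTTACCATCCAC"
REV_SPECIFIC = "CTTCATAATGCTTGCTCTGATAGG"
INDEX = "CGTGATCGTGAT"
INTRODUCER = "CAAGCAGAAGACGGCATACGAGAT"
CLOSURE = "AATGATACGGCGACCACCGAG"
IDENTIFIER = "N" * 12  # placeholder for 12 random bases


def first_stage_forward(introducer: str, index: str, specific: str) -> str:
    """Forward primer: introducer, then index, then the forward specific sequence."""
    return introducer + index + specific


def first_stage_reverse(closure: str, identifier: str, specific: str) -> str:
    """Reverse primer: closure, then identifier, then the reverse specific sequence."""
    return closure + identifier + specific


if __name__ == "__main__":
    fwd = first_stage_forward(INTRODUCER, INDEX, FWD_SPECIFIC)
    rev = first_stage_reverse(CLOSURE, IDENTIFIER, REV_SPECIFIC)
    print(len(fwd), fwd)  # 24 + 12 + 22 = 58 bases
    print(len(rev), rev)  # 21 + 12 + 24 = 57 bases
```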

Primers for Second PCR

Primers for the second PCR are designed to work on every first PCR amplicon. The primers comprise:

A Forward primer having a sequence of the Introducer. The Forward primer may comprise a 5′-Phosphate group, and two phosphorothioate (PS) bonds between the three 3′ bases.
A Reverse primer having a sequence of the Closure. The Reverse primer may comprise a 5′-Phosphate group, and two phosphorothioate (PS) bonds, between the 1st and 2nd and between the 2nd and 3rd 3′ bases.
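The modifications described above can be written out as annotated primer strings. The sketch below is only an illustration: the "/5Phos/" prefix and the "*" symbol for a phosphorothioate bond follow a notation used by some oligonucleotide vendors and are assumptions, not part of this disclosure, and the function name is hypothetical.

```python
INTRODUCER = "CAAGCAGAAGACGGCATACGAGAT"  # Forward primer sequence (from the text)
CLOSURE = "AATGATACGGCGACCACCGAG"        # Reverse primer sequence (from the text)


def annotate_primer(sequence, five_prime_phosphate=True, ps_bonds=2):
    """Return the primer with an optional 5' phosphate and PS bonds at the 3' end.

    ps_bonds=2 places one bond between each pair of the last three bases.
    The "/5Phos/" and "*" markers are an assumed vendor-style notation.
    """
    if ps_bonds < 0 or ps_bonds >= len(sequence):
        raise ValueError("ps_bonds must be between 0 and len(sequence) - 1")
    split = len(sequence) - ps_bonds - 1
    head, tail = sequence[:split], sequence[split:]
    annotated = head + "*".join(tail)
    return ("/5Phos/" if five_prime_phosphate else "") + annotated


forward = annotate_primer(INTRODUCER)
reverse = annotate_primer(CLOSURE)
print(forward)
print(reverse)
```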

First PCR and Second PCR

The procedure comprises a first PCR and a second PCR reaction, each PCR with unique primers as described above. The first PCR is aimed at preparing the target region for the second PCR. The second PCR amplifies only amplicons produced during the first PCR.

First PCR

The following components are added to a sterile strip tube:

component                          μl
PCR Master Mix                     12.5
First stage Primer-Fwd (0.1 nM)    1
First stage Primer-Rev (0.1 nM)    1
DNA (1-50 ng)                      1-9.5
Nuclease-free water                Up to 25

Set a 50 μl or 100 μl pipette to 20 μl and then pipette the entire volume up and down at least 10 times to mix thoroughly. Perform a quick spin to collect all liquid from the sides of the tube. Place the tube on a thermocycler and perform PCR amplification using the following PCR cycling conditions:

TABLE 1

Cycle Step               Temp. ° C.   Time         Cycles
Initial Denaturation     95           15 minutes   1
Denaturation             94           30 seconds
Annealing                65                        3-5
Extension                72
Hold for second stage    10           ∞            1
Denaturation             94           30 seconds
Annealing                65                        20-35
Extension                72
Final Extension          72           5 minutes    1
Hold                     4            ∞            1

The PCR program may be changed and adjusted according to the polymerase enzyme used.
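The first PCR reaction mix listed above is made up to 25 μl with nuclease-free water, so the water volume depends on the DNA volume used. A minimal sketch of that arithmetic follows; the function and constant names are hypothetical, and the volumes are those given in the component list.

```python
TOTAL_VOLUME_UL = 25.0   # final reaction volume
MASTER_MIX_UL = 12.5     # PCR Master Mix
PRIMER_UL = 1.0          # each first stage primer (Fwd and Rev)


def water_volume_ul(dna_volume_ul):
    """Nuclease-free water needed to bring the first PCR mix to 25 ul."""
    if not 1.0 <= dna_volume_ul <= 9.5:
        raise ValueError("DNA volume is expected to be 1-9.5 ul")
    return TOTAL_VOLUME_UL - MASTER_MIX_UL - 2 * PRIMER_UL - dna_volume_ul


for dna in (1.0, 5.0, 9.5):
    print(f"DNA {dna} ul -> water {water_volume_ul(dna)} ul")
```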

Second PCR

When the first PCR program holds at 10° C., carefully add the second PCR primers (1 μl, 5-10 μM from each primer) and let the PCR program continue.

Clean-Up of PCR Reaction

PCR products from the previous step are cleaned for further reactions with AMPure XP magnetic beads (Beckman Coulter).

When using AMPure XP Beads, allow the beads to warm to room temperature for at least 30 minutes before use and vortex the beads firmly to resuspend. Follow best practice or the manufacturer's protocol for AMPure XP Beads for >250 bp size selection:

Add substantially 0.4× of resuspended beads to the PCR reaction (for a 25 μl PCR reaction use 10 μl of resuspended beads). Mix well by pipetting up and down at least 10 times.
Incubate samples on bench top for at least 5 minutes at room temperature.
Place the tube/plate on an appropriate magnetic stand to separate the beads from the supernatant.
After 5 minutes (or when the solution is clear), carefully remove and discard the supernatant.
Add 200 μl of freshly prepared 80% ethanol to the tube/plate while in the magnetic stand.
Incubate at room temperature for 30 seconds, and then carefully remove and discard the supernatant. Repeat this step for a second ethanol wash. Be sure to remove all visible liquid after the second wash.
Air dry the beads for up to 5 minutes while the tube/plate is on the magnetic stand with the lid open.
Remove the tube/plate from the magnetic stand. Elute the DNA from the beads into 15 μl of Nuclease-free water.
Mix well on a vortex mixer or by pipetting up and down 10 times. Incubate for at least 2 minutes at room temperature.
Place the tube/plate on a magnetic stand. After 5 minutes (or when the solution is clear), transfer 13-15 μl to a new PCR tube.
Measure dsDNA in the tube by using Qubit, NanoDrop (or equivalent).
Combine equivalent amounts of PCR amplicons/Fragment-Construct from the different panel-PCR tubes.
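Two small calculations recur in the clean-up steps above: the bead volume for a given bead ratio, and the volume to take from each tube when combining equivalent amounts. The sketch below illustrates both. It interprets "equivalent amounts" as equal mass, which is an assumption (an equimolar pool would also need the amplicon lengths), and the sample names, concentrations and function names are hypothetical.

```python
def bead_volume_ul(reaction_volume_ul, ratio):
    """Bead volume for a given bead-to-sample ratio (e.g. 0.4x)."""
    return reaction_volume_ul * ratio


def pooling_volumes_ul(concentrations_ng_per_ul, mass_per_sample_ng):
    """Volume to take from each tube so every sample contributes equal mass."""
    return {
        name: mass_per_sample_ng / concentration
        for name, concentration in concentrations_ng_per_ul.items()
    }


# 0.4x beads for a 25 ul PCR reaction, as in the protocol above.
print(bead_volume_ul(25.0, 0.4))

# Hypothetical measured concentrations (ng/ul) for two panel-PCR tubes.
measured = {"tube_A": 20.0, "tube_B": 10.0}
print(pooling_volumes_ul(measured, mass_per_sample_ng=50.0))
```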

Amplicon Ligation

Use T4 DNA Ligase (M0202, NEB).

Set up the following reaction in a microcentrifuge tube on ice.

COMPONENT                      50 μl REACTION
T4 DNA Ligase Buffer (10X)*    5 μl
Fragment-Construct DNA         0.1-0.5 pmol
Nuclease-free water            to 50 μl
T4 DNA Ligase**                2.5 μl

*The T4 DNA Ligase Buffer should be thawed and resuspended at room temperature.
**T4 DNA Ligase should be added last.

Gently mix the reaction by pipetting up and down and microfuge briefly.
Incubate at room temperature for 2 hours.
Heat inactivate at 65° C. for 10 minutes.

Chill on ice.
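The ligation input above is given in pmol, while the preceding clean-up measures dsDNA by mass. A minimal conversion sketch follows, assuming an average mass of about 660 g/mol per base pair of double-stranded DNA (a common rule of thumb, not a value from this disclosure) and a hypothetical 300 bp Fragment-Construct. The function names are illustrative.

```python
AVG_DSDNA_G_PER_MOL_PER_BP = 660.0  # assumed average mass of one base pair


def pmol_to_ng(pmol, length_bp):
    """Mass (ng) of double-stranded DNA for a given amount (pmol) and length."""
    picograms = pmol * length_bp * AVG_DSDNA_G_PER_MOL_PER_BP  # pmol x g/mol = pg
    return picograms / 1000.0


def ng_to_pmol(ng, length_bp):
    """Amount (pmol) of double-stranded DNA for a given mass (ng) and length."""
    return ng * 1000.0 / (length_bp * AVG_DSDNA_G_PER_MOL_PER_BP)


# Mass range corresponding to 0.1-0.5 pmol of a hypothetical 300 bp unit.
low, high = pmol_to_ng(0.1, 300), pmol_to_ng(0.5, 300)
print(f"{low:.1f}-{high:.1f} ng")
```

Under these assumptions, 0.1-0.5 pmol of a 300 bp unit corresponds to roughly 20-99 ng of DNA.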

Cleanup of Ligation Reaction

Ligation products from the previous step are cleaned for further reactions with AMPure XP magnetic beads (Beckman Coulter).

When using AMPure XP Beads (Beckman Coulter), allow the beads to warm to room temperature for at least 30 minutes before use and vortex the beads firmly to resuspend. Follow best practice or the manufacturer's protocol for AMPure XP Beads for >250 bp size selection:

Add ~0.1× of resuspended beads to the ligation reaction (for a 50 μl reaction use 5 μl of resuspended beads). Mix well by pipetting up and down at least 10 times.
Incubate samples on bench top for at least 5 minutes at room temperature.
Place the tube/plate on an appropriate magnetic stand to separate the beads from the supernatant.
After 5 minutes (or when the solution is clear), carefully remove and discard the supernatant.
Add 200 μl of freshly prepared 80% ethanol to the tube/plate while in the magnetic stand. Incubate at room temperature for 30 seconds, and then carefully remove and discard the supernatant. Repeat this step for a second ethanol wash. Be sure to remove all visible liquid after the second wash.
Air dry the beads for up to 5 minutes while the tube/plate is on the magnetic stand with the lid open.
Remove the tube/plate from the magnetic stand. Elute the DNA from the beads into 20 μl of 10 mM Tris-HCl or 0.1×TE.
Mix well on a vortex mixer or by pipetting up and down 10 times. Incubate for at least 2 minutes at room temperature.
Place the tube/plate on a magnetic stand. After 5 minutes (or when the solution is clear), transfer 17-20 μl to a new PCR tube.
Measure dsDNA in the tube by using Qubit, NanoDrop (or equivalent).
Combine equivalent amounts of PCR amplicons/Fragment-Construct from the different panel-PCR tubes.

Preparation of Library and Sequencing with Oxford Nanopore Technologies

Follow one of the protocols for library preparation:

Rapid Sequencing Kit, SQK-RAD004; or Ligation Sequencing Kit 1D, SQK-LSK108.

Sequence the library by using an Oxford Nanopore Technologies platform (MinION, GridION) according to the manufacturer's protocol.

Data Analysis

Data analysis is preferably conducted by bioinformatic techniques, and includes the steps mentioned above.

An Alternative Method for Preparing Mature Units

Previously, mature units were prepared for ligation by a first PCR and a second PCR, wherein the primers for the first PCR included the introducer, index, identifier and closure sequences. Described here is an alternative method for preparing mature units for ligation.

Conjugating Indices and Identifiers by Ligation

Conjugating UMI's by Ligation

At this stage a specific panel of the target genome is amplified in a PCR reaction, while the amplified amplicons are not tagged with indices and identifiers. After the first PCR reaction, indices and identifiers as well as introducers and closures are attached to the amplicons by ligation, and a second PCR reaction is performed in order to amplify the unit.

At this stage the first reaction primers are specific primers for a desired location/panel to be sequenced later and do not comprise any elements at their 5′-ends.

This protocol uses the following reagents as a recommendation, but they can be replaced by alternative reagents/compounds:

1. NEBNext® Ultra™ End Repair/dA-Tailing Module (NEB #E7442).
2. NEBNext® Ultra Ligation Module (NEB #E7445).
3. xGen® Dual Index UMI Adapters.

In general, the procedure comprises the following:

PCR amplification of target regions;
ligation of adaptors with UMI's;
amplification of ligated fragments with secondary primers (primers are designed to hybridize to the 5′ element of the adaptor);
ligation of fragments in order to make long DNA strands of conjugated Fragment-Constructs;
library preparation for long DNA fragments;
data analysis;
mutation report.

First PCR—Amplification of Desired Target Sequence

Perform the PCR reaction with a high fidelity polymerase enzyme and a limited number of cycles (5-15 according to the amount of starting material). Use the target specific primers.

Cleanup of PCR Reaction (Recommended)

PCR products from the first PCR are cleaned for further reactions with AMPure XP magnetic beads (Beckman Coulter) or any other PCR cleanup protocol in order to eliminate residual elements from the previous stage such as primers and buffers.

End Repair/dA-Tailing

Follow the NEBNext® Ultra™ End Repair/dA-Tailing Module (NEB #E7442) protocol:

Mix the following components in a sterile, nuclease-free tube:
(green) End Prep Enzyme Mix: 3.0 μl;
(green) End Repair Reaction Buffer (10×): 6.5 μl;
PCR amplicons from previous step: 55.5 μl.
Mix by pipetting, followed by a quick spin to collect all liquid from the sides of the tube.
Place in a thermocycler, with the heated lid on, and run the following program:
30 minutes @20° C.;
30 minutes @65° C.;
Hold at 4° C.

Proceed directly to NEBNext Ultra Ligation Module (NEB #E7445):

If DNA input prior to End Repair is <100 ng, dilute the xGen® Dual Index UMI Adapters 1:10 in 10 mM Tris-HCl pH 7.5-8.0 or 10 mM Tris-HCl pH 7.5-8.0 with 10 mM NaCl to a final concentration of 1.5 μM. Use immediately.
Add the following components directly to the End Prep reaction mixture and mix well:
(red) Blunt/TA Ligase Master Mix: 15 μl;
xGen® Dual Index UMI Adapters: 2.5 μl;
(red) Ligation Enhancer: 1 μl.
Mix by pipetting, followed by a quick spin to collect all liquid from the sides of the tube. Incubate at 20° C. for 15 minutes in a thermocycler.
DNA is now ready for size selection or clean-up.
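As a quick check of the volumes listed above, the following sketch tallies the End Prep reaction and the adapter ligation that is added to it. The dictionary and variable names are hypothetical. The stock adapter concentration is not stated in the text; the value computed here is only what a 1:10 dilution to 1.5 μM would imply.

```python
# Volumes (ul) as listed in the End Prep and ligation steps above.
END_PREP_UL = {
    "End Prep Enzyme Mix": 3.0,
    "End Repair Reaction Buffer (10x)": 6.5,
    "PCR amplicons": 55.5,
}
LIGATION_ADDITIONS_UL = {
    "Blunt/TA Ligase Master Mix": 15.0,
    "xGen Dual Index UMI Adapters": 2.5,
    "Ligation Enhancer": 1.0,
}

end_prep_total_ul = sum(END_PREP_UL.values())
ligation_total_ul = end_prep_total_ul + sum(LIGATION_ADDITIONS_UL.values())

# A 1:10 dilution to 1.5 uM implies this stock concentration (an inference).
DILUTION_FACTOR = 10
DILUTED_ADAPTER_UM = 1.5
implied_stock_um = DILUTED_ADAPTER_UM * DILUTION_FACTOR

print(end_prep_total_ul, ligation_total_ul, implied_stock_um)
```

The total after ligation matters for the clean-up that follows, since bead volumes are set as a ratio of the reaction volume.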

Cleanup of PCR Reaction (Recommended)

PCR products from the previous step are cleaned for further reactions with AMPure XP magnetic beads (Beckman Coulter) or any other PCR cleanup protocol in order to eliminate residual elements from the previous stage such as primers and buffers.

Second PCR

This stage amplifies the entire unit from the previous stage with primers that hybridize to the exterior elements in the construct (if using xGen® Dual Index UMI Adapters the exterior elements will be the P5 and P7 regions). The primers will have a 5′ phosphate group for further application.

Perform the second PCR with a high fidelity polymerase enzyme and a limited number of cycles (5-15 according to the amount of starting material). Use the general amplification primers. If using xGen® Dual Index UMI Adapters use the following primers:

/Phos/CAAGCAGAAGACGGCATACGA, and /Phos/AATGATACGGCGACCACCGA.

Cleanup of PCR Reaction (Recommended)

Products of the second PCR are cleaned for further reactions with AMPure XP magnetic beads (Beckman Coulter) or any other PCR cleanup protocol in order to eliminate residual elements from the previous stage such as primers and buffers.

The mature units that were obtained are ligated as described above.

One of the purposes of the present subject matter is to distinguish between errors introduced into a desired target sequence during the procedure of its preparation for sequencing and during the sequencing itself, and mutations in the target sequence that are sought, for example, for the purpose of diagnostics. They can be distinguished by sequencing multiple copies of the same target sequence that is present in the segment, while being able to identify sequences of copies of the same template, or target sequence. This is achieved by attaching the identifier 16 to the segment 12. As described above, each copy of the segment 12 is tagged with a specific identifier 16 before the second PCR and before the sequencing of the DNA construct 100. Therefore, sequences of the segment 12 that are tagged with the same identifier are considered identical in the sequence of the original DNA target, while any variation in the sequence between them is considered as originating from error in the second PCR and the sequencing procedure.
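The idea of collapsing copies that share an identifier can be sketched as a per-position majority vote. This is a simplified illustration, not the analysis method of this disclosure: it assumes the segment sequences in a group are already aligned and of equal length, and it groups by identifier only, whereas a fuller analysis would first group by index and by segment. All names and the toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def consensus(sequences):
    """Majority base at each position of equal-length, pre-aligned sequences."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


def collapse_by_identifier(units):
    """Group (identifier, segment) pairs by identifier and collapse each group."""
    groups = defaultdict(list)
    for identifier, segment in units:
        groups[identifier].append(segment)
    return {identifier: consensus(segments) for identifier, segments in groups.items()}


# Toy data: three copies of one template, one carrying an amplification or read error.
units = [
    ("AAAACCCCGGGG", "ACGTACGT"),
    ("AAAACCCCGGGG", "ACGTACGT"),
    ("AAAACCCCGGGG", "ACTTACGT"),
    ("TTTTGGGGCCCC", "ACGAACGT"),
]
print(collapse_by_identifier(units))
```

In this toy example the single discordant base in the first group is outvoted, while the difference carried by the second identifier is retained as a candidate variant of a separate template.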

Another unique feature of the present subject matter is the sequential attachment of multiple units, tagged with an identifier 16, as described above, to form a long DNA construct 10 that is suitable for sequencing in methods that are configured for sequencing of very long DNA fragments, like nanopore sequencing. The other components of the unit 1 assist in the analysis of the sequences obtained: the introducer 18 and closure 19 assist in finding the borders of the units in the sequence; the index allows identification of the source of the sequence of the segment 12, thus allowing analysis of samples from multiple sources simultaneously; and the sequence of the segment 12 allows identifying the target sequence, thus allowing analysis of multiple target sequences simultaneously.
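How the introducer and closure mark unit borders can be illustrated with a simple string search over a concatenated read. The sketch below is a simplification under stated assumptions: it uses the example Introducer and Closure sequences and the 12-base index and identifier lengths from the Examples, requires exact matches, and handles only units read in the forward orientation. Real long reads contain errors and units in both orientations, so a practical tool would need approximate matching. All function names are hypothetical.

```python
INTRODUCER = "CAAGCAGAAGACGGCATACGAGAT"
CLOSURE = "AATGATACGGCGACCACCGAG"
INDEX_LENGTH = 12
IDENTIFIER_LENGTH = 12

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(sequence):
    return sequence.translate(_COMPLEMENT)[::-1]


def split_units(read):
    """Return the index, segment and identifier of each forward-oriented unit.

    On the strand that starts with the introducer, the closure and the
    identifier appear as reverse complements, because they sit at the
    5' end of the reverse primer.
    """
    closure_rc = reverse_complement(CLOSURE)
    units, position = [], 0
    while True:
        start = read.find(INTRODUCER, position)
        if start == -1:
            break
        body_start = start + len(INTRODUCER)
        end = read.find(closure_rc, body_start)
        if end == -1:
            break
        body = read[body_start:end]
        if len(body) > INDEX_LENGTH + IDENTIFIER_LENGTH:
            units.append({
                "index": body[:INDEX_LENGTH],
                "segment": body[INDEX_LENGTH:-IDENTIFIER_LENGTH],
                "identifier": reverse_complement(body[-IDENTIFIER_LENGTH:]),
            })
        position = end + len(closure_rc)
    return units


def make_unit(index, segment, identifier):
    """Build a toy forward-oriented unit for demonstration."""
    return (INTRODUCER + index + segment
            + reverse_complement(identifier) + reverse_complement(CLOSURE))


read = (make_unit("CGTGATCGTGAT", "ACGTACGTAA", "AAAACCCCGGGG")
        + make_unit("CGTGATCGTGAT", "TTGGCCAATT", "GGGGTTTTAAAA"))
for unit in split_units(read):
    print(unit)
```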

It is appreciated that certain features of the subject matter, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the subject matter, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub combination.

Although the subject matter has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.

1. A DNA construct comprising multiple units sequentially attached one to the other, wherein a unit comprises: a segment; an index attached to one end of the segment; an identifier attached to another end of the segment; an introducer attached to a 5′-end of either the index or the identifier, and a closure attached to a 5′-end of a remaining either identifier or index.
2. The DNA construct of claim 1, wherein the length of the DNA construct is at least substantially 1,000 bp.
3. The DNA construct of claim 1, wherein the length of the segment is up to substantially 1,000 bp.
4. The DNA construct of claim 1, wherein the length of the segment is in the range of substantially 100-500 bp.
5. A method for preparing a DNA construct, the method comprising: obtaining a segment; attaching an index, an identifier, an introducer and a closure to the segment, wherein the index is attached to one end of the segment; the identifier is attached to another end of the segment; the introducer is attached to a 5′-end of either the index or the identifier; and the closure is attached to a 5′-end of a remaining either identifier or index, giving rise to a pre-mature unit; amplifying the pre-mature unit with primers specific to the introducer and closure, giving rise to a double stranded mature unit; phosphorylating 5′-ends of the strands of the mature unit, and sequentially attaching mature units.
6. A method for analyzing a sequence of a DNA construct according to claim 1, the method comprising: separating sequences of units one from the other; grouping units having the same index for obtaining same index groups; grouping units having the same segment sequence in each same index group for obtaining same segment groups; grouping units having the same identifier sequence in each same segment group for obtaining same identifier groups; collapsing multiple segment sequences in each same identifier group to a single sequence that accurately represents the sequence of the target sequence according to which the segment was obtained.